Shared Task: Crowdsourced Accessibility Elicitation of Wikipedia Articles
Abstract
Mechanical Turk is useful for generating complex speech resources such as conversational speech transcriptions. In this work, we explore the next step: eliciting narrations of Wikipedia articles to improve accessibility for low-literacy users. This task proves a useful test bed for qualitative vetting of workers based on difficult-to-define metrics such as narrative quality. Working with the Mechanical Turk API, we collected sample narrations, had other Turkers rate these samples, and then granted access to full narration HITs depending on aggregate quality. While narrating full articles proved too onerous a task to be viable, using other Turkers to perform vetting was very successful. Elicitation is possible on Mechanical Turk, but it should conform to suggested best practices of simple tasks that can be completed in a streamlined workflow.
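The vetting step described in the abstract (sample narrations rated by other Turkers, with access granted on aggregate quality) could be sketched roughly as follows. This is only an illustration under stated assumptions: the numeric rating scale, the `threshold` and `min_ratings` parameters, the worker IDs, and the function name `vet_workers` are all hypothetical and do not come from the paper, and no Mechanical Turk API calls are made.

```python
from statistics import mean


def vet_workers(ratings, threshold=3.5, min_ratings=3):
    """Return the set of workers whose sample narration passed peer vetting.

    ratings: dict mapping a worker ID to the list of numeric scores that
             other raters gave that worker's sample narration.
    threshold and min_ratings are assumed values, not taken from the paper.
    """
    approved = set()
    for worker_id, scores in ratings.items():
        # Require enough independent ratings before trusting the average.
        if len(scores) >= min_ratings and mean(scores) >= threshold:
            approved.add(worker_id)
    return approved


# Hypothetical ratings on an assumed 1-5 scale.
sample_ratings = {
    "worker_a": [4, 5, 4],  # enough ratings, high average
    "worker_b": [2, 3, 2],  # enough ratings, low average
    "worker_c": [5, 5],     # high average, too few ratings
}

print(sorted(vet_workers(sample_ratings)))
```

In a real deployment, the approved set would presumably be used to grant those workers access to the full narration HITs; how the authors actually aggregated ratings is not stated in the abstract.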
Similar resources
Crowdsourced Accessibility: Elicitation of Wikipedia Articles
Mechanical Turk is useful for generating complex speech resources such as conversational speech transcriptions. In this work, we explore the next step: eliciting narrations of Wikipedia articles to improve accessibility for low-literacy users. This task proves a useful test bed for qualitative vetting of workers based on difficult-to-define metrics such as narrative quality. Working with th...
Ranking Automatically Generated Questions as a Shared Task
We propose a shared task for question generation: the ranking of reading comprehension questions about Wikipedia articles generated by a base overgenerating system. This task focuses on domain-general issues in question generation and invites a variety of approaches, and also permits semi-automatic evaluation. We describe an initial system we developed for this task, and an annotation scheme us...
BUCC Shared Task: Cross-Language Document Similarity
We summarise the organisation and results of the first shared task aimed at detecting the most similar texts in a large multilingual collection. The dataset of the shared task was based on Wikipedia dumps with interlanguage links, with further filtering to ensure comparability of the paired articles. The eleven system runs we received have been evaluated using the TREC evaluation metrics.
Overview of the 2nd International Competition on Wikipedia Vandalism Detection
The paper overviews the vandalism detection task of the PAN’11 competition. A new corpus is introduced which comprises about 30 000 Wikipedia edits in the languages English, German and Spanish as well as the necessary crowdsourced annotations. Moreover, the performance of three vandalism detectors is evaluated and compared to those of the PAN’10 competition. Vivien Petras and Paul Clough (Eds.)...
Crowdsourcing elicitation data for semantic typologies
In semantic typology, it is desirable to have quick and easy access to crosslinguistic elicitations describing stimuli from a semantic domain. We explore the use of crowdsourcing for obtaining such data, and compare it with fieldwork data obtained through in-person elicitations. Despite potential concerns about the quality of crowdsourced data, we find no difference in the amount of between-lan...